Abstract

We analyze a simple network where a source and a receiver are connected by a line of erasure channels of
different reliabilities. Recent prior work has shown that random linear network coding can achieve the min-cut
capacity and therefore the asymptotic rate is determined by the worst link of the line network. In this paper we
investigate the delay for transmitting a batch of packets, which is a function of all the erasure probabilities and the
number of packets in the batch. We show a monotonicity result on the delay function and derive simple expressions
which characterize the expected delay behavior of line networks. Further, we use a martingale bounded differences
argument to show that the actual delay is tightly concentrated around its expectation.

I. Introduction 

A common approach for practical network coding performs random linear coding over batches or 
generations [1], where the relevant delay measure is the time taken for the batch to be received. Such
in-network coding is particularly beneficial in lossy networks [2] compared to end-to-end erasure coding. In
this paper we investigate the batch end-to-end delay for lossy line networks. We consider the use of random 
linear network coding without feedback and a packet erasure model with different link qualities. All the 
nodes in the network store all the packets they receive and whenever given a transmission opportunity, 
send a random linear combination of all the stored packets [2], [3] over erasure links. 

Despite the extensive recent work on network coding over lossy networks (e.g. [2], [3], [4]), the expected
time required to send a fixed number of packets over a network of erasure links is not completely
characterized. Closely related work on delay in queueing theory [5], [6] assumes Poisson arrivals, and
its results pertain to the delay of individual packets in steady state, while [7] examines the delay for
a single queue multicasting to several users using block network coding. In our work, we consider a
batch of $n$ packets that need to be communicated over a line network of $\ell$ erasure links, where the links
experience erasures with probabilities $p_1, p_2, \ldots, p_\ell$, and we are interested in the expected total time $E T_n$
for the $n$ packets to travel across the line network.

Prior work [2], [3] established that random linear network coding can achieve the min-cut capacity and
therefore the asymptotic rate is determined by the worst link of the line network. Therefore, the expected
time $E T_n$ for the $n$ packets to cross the network is

$$E T_n = \frac{n}{1 - \max_{1 \le i \le \ell} p_i} + D(n, p_1, p_2, \ldots, p_\ell), \tag{1}$$

where the delay function $D(n, p_1, p_2, \ldots, p_\ell)$ is the sublinear part:

$$\lim_{n \to \infty,\ p_i\ \text{fixed}} \frac{D(n, p_1, p_2, \ldots, p_\ell)}{n} = 0.$$

However, relatively little is known about the delay function $D(n, p_1, p_2, \ldots, p_\ell)$.

In this work we characterize the delay function by showing that it is non-decreasing in $n$ and is bounded
by a simple function $\bar{D}(p_1, p_2, \ldots, p_\ell)$ of the link erasure probabilities. The main results of this paper are
the following two theorems, which characterize the expected behavior and show a concentration of the
actual delay random variable close to this expectation.

Theorem 1: Consider $n$ packets communicated through a line network of $\ell$ links with erasure
probabilities $p_1, p_2, \ldots, p_\ell$ and assume that there is a unique worst link:

$$p_m := \max_{1 \le i \le \ell} p_i, \qquad p_i < p_m < 1 \quad \forall i \neq m.$$

The expected time $E T_n$ to send all $n$ packets is:

$$E T_n = \frac{n}{1 - \max_{1 \le i \le \ell} p_i} + D(n, p_1, p_2, \ldots, p_\ell),$$

where the delay function $D(n, p_1, p_2, \ldots, p_\ell)$ is non-decreasing in $n$ and upper bounded by:

$$\bar{D}(p_1, p_2, \ldots, p_\ell) := \sum_{i=1,\ i \neq m}^{\ell} \frac{p_m}{p_m - p_i}.$$

If, on the other hand, there are two links that take the worst value, then the delay function is not bounded
but still exhibits the sublinear behavior. Pakzad et al. [3] prove that in the case of a two-hop network with
identical links the delay function grows as $\sqrt{n}$. We also prove the following concentration result:

Theorem 2: The time $T_n$ for $n$ packets to travel across the network is concentrated around its expected
value with high probability. In particular, for sufficiently large $n$:

$$P[|T_n - E T_n| > \epsilon_n] \le \frac{2(1 - \max_i p_i)}{n} + o\!\left(\frac{1}{n}\right)$$

for deviations $\epsilon_n = n^{3/4}/(1 - \max_i p_i)$.

Since $E T_n$ grows linearly in $n$ and the deviations $\epsilon_n$ are sublinear, $T_n$ is tightly concentrated around its
expectation for large $n$ with probability approaching one.

The remainder of this paper is organized as follows: Section II presents the precise model we use for
packet communication. Section III presents the analysis for the general multi-hop network. Section IV
contains a discussion of the results presented in this paper along with comments for future research.

II. Model

The general network under consideration is depicted in Fig. 1. The network consists of $\ell + 1$ nodes
$N^{(i)}$, $1 \le i \le \ell + 1$, and $\ell$ links $L^{(i)}$, $1 \le i \le \ell$, with source node $N^{(1)}$ and destination node $N^{(\ell+1)}$. Node
$N^{(i)}$, $1 \le i \le \ell$, is connected to node $N^{(i+1)}$ to its right through the erasure link $L^{(i)}$.

We assume a discrete time model in which the source wishes to transmit $n$ packets to the destination.
At each time step, node $N^{(i)}$ can transmit one packet through link $L^{(i)}$ to node $N^{(i+1)}$, $1 \le i \le \ell$. The
transmission succeeds with probability $1 - p_i$ or the packet gets erased with probability $p_i$. Erasures across
different links and time steps are assumed to be independent. At each time step the packet transmitted
by node $N^{(i)}$ is a random linear combination of all previously received packets at the node. We want to
determine the time $T_n$ taken for the destination node to receive (decode) all the $n$ packets initially present
at the source node $N^{(1)}$. We assume that no link fails with probability 1 ($p_i < 1$, $1 \le i \le \ell$), or else the
problem becomes trivial since no packets can travel through the network. The destination node
$N^{(\ell+1)}$ will decode once it receives $n$ linearly independent combinations of the initial packets.

Coding at each hop (network coding) is needed to achieve minimum delay when feedback is unavailable,
slow or expensive. If instantaneous feedback is available at each hop, an automatic repeat request (ARQ)
scheme with simple forwarding of packets achieves a block delay performance identical to network coding.
Note that coding only at the source is suboptimal in terms of throughput and delay [2]. The only feedback
required in the network coding case is that the destination node $N^{(\ell+1)}$, once it receives all the necessary
linearly independent packets, signals the end of transmission to all the other nodes.

As explained in [8], information travels through the network in the form of innovative packets. A packet
at node $N^{(i)}$, $2 \le i \le \ell$, is innovative if it does not belong to the space spanned by packets present at
node $N^{(i+1)}$. Each node needs to code, and therefore store, only the part of the information that has not
already been received by $N^{(i+1)}$. If feedback were present, nodes could equivalently drop packets that do
not add information to the nodes on their right. Therefore the analysis becomes essentially a queueing
theory problem for innovative packets.

In our model, in case of a success the packet is assumed to be transmitted to the next node
instantaneously, i.e. we ignore the transmission delay along the links. Moreover, there is no restriction on
the number of packets $n$ or the number of hops $\ell$, and there is no requirement for the network to reach
steady state.
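The model above can be explored numerically. The following is a minimal Python sketch, not the authors' simulation code: it assumes that every successful transmission delivers an innovative packet (as holds with high probability for random linear coding over a large field), so each node is summarized by a count of innovative packets. The names `simulate_batch_delay` and `estimate_mean_delay` and their parameters are illustrative assumptions.

```python
import random


def simulate_batch_delay(n, erasure_probs, rng):
    """Return one sample of T_n for a line network.

    Assumption (for illustration only): every successful transmission carries
    an innovative packet, so node i is summarized by the number of innovative
    packets it still holds for its right-hand neighbour.
    """
    num_links = len(erasure_probs)
    # queue[i] = innovative packets waiting at node i+1 (queue[0] is the source).
    queue = [n] + [0] * (num_links - 1)
    received = 0
    t = 0
    while received < n:
        t += 1
        # Process links from last to first so that a packet received in this
        # time step cannot be forwarded again within the same step.
        for i in range(num_links - 1, -1, -1):
            if queue[i] > 0 and rng.random() >= erasure_probs[i]:
                queue[i] -= 1
                if i + 1 < num_links:
                    queue[i + 1] += 1
                else:
                    received += 1
    return t


def estimate_mean_delay(n, erasure_probs, trials=2000, seed=0):
    """Average T_n over independent trials (Monte Carlo estimate of E T_n)."""
    rng = random.Random(seed)
    samples = [simulate_batch_delay(n, erasure_probs, rng) for _ in range(trials)]
    return sum(samples) / trials
```

For instance, `estimate_mean_delay(50, [0.5, 0.3])` gives a Monte Carlo estimate of the expected batch delay for a two-hop network; with all erasure probabilities set to zero the function returns the deterministic pipeline delay of $n + \ell - 1$ time steps.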

Fig. 1. Multi-hop network, with the source at one end of the line and the destination at the other



III. General Line Networks 

A. Proof of Theorem 1 

Let the random variable $R_n^{(i)}$, $2 \le i \le \ell$, denote the rank difference between node $N^{(i)}$ and node $N^{(i+1)}$
at the moment packet $n$ arrives at $N^{(2)}$. This is exactly the number of innovative packets present at node
$N^{(i)}$ at the random time when packet $n$ arrives at $N^{(2)}$.

The time $T_n$ taken to send $n$ packets from the source node $N^{(1)}$ to the destination $N^{(\ell+1)}$ can be
expressed as the sum of the time $T_n^{(1)}$ required for all the $n$ packets to cross the first link and the time $\tau_n$
required for all the remaining innovative packets $R_n^{(2)}, \ldots, R_n^{(\ell)}$ at nodes $N^{(2)}, \ldots, N^{(\ell)}$ respectively to
reach the destination node $N^{(\ell+1)}$:

$$T_n = T_n^{(1)} + \tau_n. \tag{2}$$

All the quantities in equation (2) are random variables and we want to compute their expected values.
Due to the linearity of the expectation

$$E T_n = E T_n^{(1)} + E \tau_n \tag{3}$$

and by defining $X_j^{(1)}$, $1 \le j \le n$, to be the time taken for packet $j$ to cross the first link, we get:

$$E T_n^{(1)} = \sum_{j=1}^{n} E X_j^{(1)} = \frac{n}{1 - p_1}, \tag{4}$$

since $X_j^{(1)}$, $1 \le j \le n$, are all geometric random variables ($P(X_j^{(1)} = k) = (1 - p_1) \cdot p_1^{k-1}$, $k \ge 1$).
Therefore combining equations (3) and (4) we get:

$$E T_n = \frac{n}{1 - p_1} + E \tau_n. \tag{5}$$

Equations (1), (5) give us

$$D(n, p_1, p_2, \ldots, p_\ell) = \frac{n}{1 - p_1} - \frac{n}{1 - \max_{1 \le i \le \ell} p_i} + E \tau_n,$$

and clearly the key quantity for calculating the delay function $D(n, p_1, p_2, \ldots, p_\ell)$ is the expected time
$E \tau_n$ taken for all the remaining innovative packets at nodes $N^{(2)}, \ldots, N^{(\ell)}$ to reach the destination. For
the simplest case of a two-hop network ($\ell = 2$) we can derive recursive formulas for computing this
expectation for each $n$. Table I has closed-form expressions for the delay function $D(n, p_1, p_2)$ for
$n = 1, \ldots, 4$. It is seen that as $n$ grows, the number of terms in the above expression increases rapidly,

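TABLE I

The delay function $D(n, p_1, p_2)$ for different values of $n$

$$\begin{array}{c|l}
n & D(n, p_1, p_2) \\ \hline
1 & \dfrac{1}{1-p_1} - \dfrac{1}{1-\max(p_1,p_2)} + \dfrac{1}{1-p_2} \\[2ex]
2 & \dfrac{2}{1-p_1} - \dfrac{2}{1-\max(p_1,p_2)} + \dfrac{2}{1-p_2} - \dfrac{1}{1-p_1 p_2} \\[2ex]
3 & \dfrac{3}{1-p_1} - \dfrac{3}{1-\max(p_1,p_2)} + \dfrac{1 + p_2\bigl(2 - p_1\bigl(6 - p_1 + (2 - 5p_1)p_2 + (1 - 3(1-p_1)p_1)p_2^2\bigr)\bigr)}{(1-p_2)(1-p_1 p_2)^3} \\[2ex]
4 & \dfrac{4}{1-p_1} - \dfrac{4}{1-\max(p_1,p_2)} + \dfrac{N_4(p_1, p_2)}{(1-p_2)(1-p_1 p_2)^5}
\end{array}$$

where the numerator of the last row is

$$\begin{aligned}
N_4(p_1, p_2) = 1 + p_2\Bigl(3 - p_1\bigl(&11 + 4p_1^4 p_2^4 + p_2(5 + (5 - p_2)p_2) + p_1^3 p_2(1 - p_2(5 + 2p_2(5 + 3p_2))) \\
&- p_1(4 + p_2(15 + p_2(21 - (1 - p_2)p_2))) + p_1^2(1 - p_2(1 - p_2(31 + p_2(5 + 4p_2))))\bigr)\Bigr).
\end{aligned}$$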



making these exact formulas impractical, and as expected for larger values of $\ell$ ($\ge 3$) the situation only
worsens. Our subsequent analysis derives tight upper bounds on the delay function $D(n, p_1, p_2, \ldots, p_\ell)$
for any $\ell$, which do not depend on $n$.
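For the two-hop case, the expectation can also be evaluated numerically instead of symbolically. The sketch below is one possible recursion, written under the assumption that every successful transmission is innovative: the state $(a, b)$ counts the packets still at the source and the innovative packets waiting at the middle node, and the expected remaining time follows from first-step analysis. The function names are hypothetical; comparing the output with the first rows of Table I serves only as a sanity check.

```python
def expected_delay_two_hop(n, p1, p2):
    """Exact E[T_n] for a two-hop line network via first-step analysis.

    Assumption: every successful transmission is innovative, so the state is
    (a, b) = (packets left at the source, packets waiting at the relay).
    Requires p1 < 1 and p2 < 1.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    # e[a][b] = expected time to deliver everything starting from state (a, b).
    e = [[0.0] * (n + 2) for _ in range(n + 1)]
    for b in range(1, n + 1):
        e[0][b] = b / q2                      # only the second link is active
    for a in range(1, n + 1):
        e[a][0] = 1.0 / q1 + e[a - 1][1]      # only the first link is active
        for b in range(1, n - a + 1):
            # Both links active; the state is unchanged with probability p1*p2.
            e[a][b] = (1.0
                       + q1 * p2 * e[a - 1][b + 1]
                       + p1 * q2 * e[a][b - 1]
                       + q1 * q2 * e[a - 1][b]) / (1.0 - p1 * p2)
    return e[n][0]


def delay_function_two_hop(n, p1, p2):
    """D(n, p1, p2) = E[T_n] - n / (1 - max(p1, p2)), as in equation (1)."""
    return expected_delay_two_hop(n, p1, p2) - n / (1.0 - max(p1, p2))
```

Because the table is small (of order $n^2$ states), this evaluation stays cheap even when the symbolic expressions become unwieldy.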

The $(\ell - 1)$-tuple $Y_n = (R_n^{(2)}, \ldots, R_n^{(\ell)})$ representing the number of innovative packets remaining at
nodes $N^{(2)}, \ldots, N^{(\ell)}$ the moment packet $n$ arrives at node $N^{(2)}$ (including packet $n$) is a multidimensional
Markov process with state space $E \subset \mathbb{N}^{\ell-1}$ (the state space is a proper subset of $\mathbb{N}^{\ell-1}$ since $Y_n$ can never
take the values $(0, *, \ldots, *)$). Using the coupling method [9] and an argument similar to the one given in
Proposition 2 in [10] it can be shown that $Y_n$ is a stochastically increasing function of $n$ (meaning that
as $n$ increases there is a higher probability of having more innovative packets at nodes $N^{(2)}, \ldots, N^{(\ell)}$).

Proposition 1: The Markov process $Y_n = (R_n^{(2)}, \ldots, R_n^{(\ell)})$ is $\preceq_{st}$-increasing.

Proof: Given in the appendix along with the necessary definitions. ■

A direct result of Proposition 1 is that the expected time $E \tau_n$ taken for the remaining packets at nodes
$N^{(2)}, \ldots, N^{(\ell)}$ to reach the destination is a non-decreasing function of $n$:

$$E \tau_n \le E \tau_{n+1} \le \lim_{n \to \infty} E \tau_n, \tag{6}$$

where the second inequality is meaningful when the limit exists.

Innovative packets travelling in the network from node $N^{(2)}$ to node $N^{(\ell+1)}$ can be viewed as customers
travelling through a network of service stations in tandem. Indeed, each innovative packet (customer)
arrives at the first station (node $N^{(2)}$) with a geometric arrival process and the transmission (service) time
is also geometrically distributed. Once an innovative packet has been transmitted (serviced) it leaves the
current node (station) and arrives at the next node (station) waiting for its next transmission (service).

By using the interchangeability result on service stations from Weber [11], we can interchange the
position of any two links without affecting the departure process of node $N^{(\ell)}$ and therefore the delay
function. Consequently, without loss of generality we can swap the position of the worst link in the queue
(which is unique by the assumptions of Theorem 1) with the first link, leaving the positions of all other
links unaltered, and therefore we can simply assume that the first link is the
worst link ($p_2, p_3, \ldots, p_\ell < p_1 < 1$).

It is helpful to assume the first link to be the worst one in order to use the results of Hsu and Burke
in [12]. The authors proved that a tandem network with geometrically distributed service times and a
geometric input process reaches steady state as long as the input process is slower than any of the service
processes. Our line network is depicted in Fig. 1 and the input process (of innovative packets) is the geometric
arrival process at node $N^{(2)}$ from $N^{(1)}$. Since $p_2, p_3, \ldots, p_\ell < p_1$ the arrival process is slower than any
service process (transmission of the innovative packet to the next hop) and therefore the network in Fig. 1
reaches steady state.
Sending an arbitrarily large number of packets ($n \to \infty$) makes the problem of estimating $\lim_{n \to \infty} E \tau_n$
(if the network did not reach a steady state, this limit would diverge) the same as calculating the
expected time taken for all the remaining innovative packets at nodes $N^{(2)}, \ldots, N^{(\ell)}$ to reach the
destination $N^{(\ell+1)}$ at steady state. This is exactly the expected end-to-end delay for a single customer in
a line network that has reached equilibrium. This quantity has been calculated in [13] (page 67, Theorem
4.10) and is equal to

$$\lim_{n \to \infty} E \tau_n = \sum_{i=2}^{\ell} \frac{p_1}{p_1 - p_i}. \tag{7}$$

Combining equations (6) and (7) concludes the proof of Theorem 1 by changing $p_1$ to $p_m := \max_i p_i < 1$.
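As a worked instance of the bound (an illustration added here, not a computation from the original text), take the two-hop network with $p_1 = 0.5$ and $p_2 = 0.3$ that is used in the numerical results. Assuming the steady-state delay of the single remaining queue is $p_1/(p_1 - p_2)$, the limit and the resulting bound on the expected time evaluate as follows:

```latex
\lim_{n\to\infty} E\tau_n = \frac{p_1}{p_1 - p_2} = \frac{0.5}{0.5 - 0.3} = 2.5,
\qquad
E T_n \le \frac{n}{1 - p_1} + 2.5 = 2n + 2.5 .
```

So for this network the delay exceeds the capacity-limited term $2n$ by at most $2.5$ time steps in expectation, whatever the batch size.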



B. Proof of concentration 

Here we present a martingale concentration argument. In particular we prove a slightly stronger version
of Theorem 2:

Theorem 3 (Extended version of Theorem 2): The time $T_n$ for $n$ packets to travel across the line
network is concentrated around its expected value with high probability. In particular, for sufficiently
large $n$:

$$P[|T_n - E T_n| > \epsilon_n] \le \frac{2(1 - \max_j p_j)}{n} + \frac{2(1 - \max_j p_j)\, n^{2\delta}}{n^2 - n^{1+2\delta}}$$

for deviations $\epsilon_n = n^{1/2+\delta}/(1 - \max_j p_j)$, $\delta \in (0, 1/2)$.

Proof: The main idea of the proof is to use the method of martingale bounded differences [14].
This method works as follows: first we show that the random variable whose concentration we want to establish is
a function of a finite set of independent random variables. Then we show that this function is Lipschitz
with respect to these random variables, i.e. it cannot change its value too much if only one of these
variables is modified. Using this function we construct the corresponding Doob martingale and use the
Azuma-Hoeffding inequality [14] to establish concentration. See also [15], [16] for related concentration
results using similar martingale techniques.

Unfortunately, this method does not seem to be directly applicable to $T_n$ because it cannot
be naturally expressed as a function of a bounded number of independent random variables. We therefore
first show concentration for another quantity and then link that concentration to
the concentration of $T_n$.
Specifically, we define $R_t$ to be the number of innovative (linearly independent) packets received at the
destination node $N^{(\ell+1)}$ after $t$ time steps. $R_t$ is linked with $T_n$ through the equation:

$$T_n = \arg\min_t\, (R_t = n). \tag{8}$$

The number of received packets is a well defined function of the link states at each time step. If there
are $\ell$ links in total, then:

$$R_t = g(z_{11}, \ldots, z_{1\ell}, \ldots, z_{t1}, \ldots, z_{t\ell}),$$

where $z_{ij}$, $1 \le i \le t$ and $1 \le j \le \ell$, are equal to 0 or 1 depending on whether link $j$ is OFF or ON at
time $i$. If a packet is sent on a link that is ON, it is received successfully; if sent on a link that is OFF,
it is erased. It is clear that this function satisfies a bounded Lipschitz condition with a bound equal to 1:

$$|g(z_{11}, \ldots, z_{1\ell}, \ldots, z_{ij}, \ldots, z_{t1}, \ldots, z_{t\ell}) - g(z_{11}, \ldots, z_{1\ell}, \ldots, z'_{ij}, \ldots, z_{t1}, \ldots, z_{t\ell})| \le 1.$$

This is because if we look at the history of all the links failing or succeeding at all the $t$ time slots,
changing one of these link states in one time slot can at most influence the received rank by one.
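The Lipschitz property can be checked by brute force on small instances. The sketch below is an illustration under the assumption that every transmission over an ON link is innovative whenever the sender holds an innovative packet; `received_packets` plays the role of $g$, and `max_single_flip_change` (a hypothetical helper) enumerates all ON/OFF histories and all single-entry flips.

```python
from itertools import product


def received_packets(n, link_states):
    """Evaluate R_t = g(z_11, ..., z_tl) for a given ON/OFF history.

    link_states[i][j] is 1 if link j is ON at time step i and 0 if it is OFF.
    Assumption: every transmission over an ON link is innovative whenever the
    sending node holds an innovative packet.
    """
    num_links = len(link_states[0]) if link_states else 0
    queue = [n] + [0] * (num_links - 1)
    received = 0
    for step in link_states:
        # Last link first: a packet cannot cross two links in one time step.
        for j in range(num_links - 1, -1, -1):
            if step[j] == 1 and queue[j] > 0:
                queue[j] -= 1
                if j + 1 < num_links:
                    queue[j + 1] += 1
                else:
                    received += 1
    return received


def max_single_flip_change(n, t, num_links):
    """Largest |g(z) - g(z')| over all histories z, z' differing in one entry."""
    worst = 0
    for flat in product((0, 1), repeat=t * num_links):
        z = [list(flat[i * num_links:(i + 1) * num_links]) for i in range(t)]
        base = received_packets(n, z)
        for i in range(t):
            for j in range(num_links):
                z[i][j] ^= 1                      # flip one link state
                worst = max(worst, abs(received_packets(n, z) - base))
                z[i][j] ^= 1                      # restore it
    return worst
```

The enumeration grows as $2^{t\ell}$, so it is only meant for very small $t$ and $\ell$; it illustrates the bounded-differences condition and does not replace the argument above.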

Using the Azuma-Hoeffding inequality (see Theorem 4 in the Appendix) on the Doob martingale
constructed from $R_t = g(z_{11}, \ldots, z_{1\ell}, \ldots, z_{t1}, \ldots, z_{t\ell})$ we get the following concentration result:

Proposition 2: The number of received packets $R_t$ is a random variable concentrated around its mean
value:

$$P(|R_t - E R_t| \ge \epsilon_t) \le \frac{1}{t}, \qquad \text{where } \epsilon_t = \sqrt{\frac{t\ell}{2}\ln(2t)}. \tag{9}$$

Proof: Given in the appendix. ■
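The choice of $\epsilon_t$ can be checked numerically. The sketch below assumes the Azuma-Hoeffding tail takes the standard form $2\exp(-2\lambda^2 / \sum_k d_k^2)$ and that the Doob martingale has $t\ell$ increments, each confined to an interval of width 1; the function names are illustrative.

```python
import math


def epsilon_t(t, num_links):
    """Deviation eps_t = sqrt((t * l / 2) * ln(2t)) from Proposition 2."""
    return math.sqrt(t * num_links / 2.0 * math.log(2.0 * t))


def azuma_hoeffding_tail(lam, num_terms):
    """Two-sided bound 2 * exp(-2 * lam^2 / num_terms) for a martingale with
    num_terms increments, each confined to an interval of width 1."""
    return 2.0 * math.exp(-2.0 * lam ** 2 / num_terms)
```

Evaluating `azuma_hoeffding_tail(epsilon_t(t, l), t * l)` reproduces $1/t$ up to floating-point error, since $2\epsilon_t^2/(t\ell) = \ln(2t)$.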
Using this concentration and the relation (8) between $T_n$ and $R_t$ we can show that deviations of the
order $\epsilon_t = \sqrt{\frac{t\ell}{2}\ln(2t)}$ for $R_t$ translate to deviations of the order of $\epsilon_n = n^{1/2+\delta}/(1 - \max_{1 \le i \le \ell} p_i)$ for $T_n$. In
Theorem 3 smaller values of $\delta$ give tighter bounds that hold for larger $n$. Define the events:

$$H_t = \{|R_t - E R_t| < \epsilon_t\}$$

and

$$\bar{H}_t = \{|R_t - E R_t| \ge \epsilon_t\}$$

and further define $t_n^u$ ($u$ stands for upper bound) to be some $t$, ideally the smallest $t$, such that $E R_t - \epsilon_t \ge n$,
and $t_n^l$ ($l$ stands for lower bound) to be some $t$, ideally the largest $t$, such that $E R_t + \epsilon_t \le n$. Then we
have:

$$P(T_n > t_n^u) = P(T_n > t_n^u \mid H_{t_n^u}) \cdot P(H_{t_n^u}) + P(T_n > t_n^u \mid \bar{H}_{t_n^u}) \cdot P(\bar{H}_{t_n^u}),$$

where:

• $P(T_n > t_n^u \mid H_{t_n^u}) = 0$ since at time $t = t_n^u$ the destination has already received more than $n$ innovative
packets. Indeed, given that $H_{t_n^u}$ holds: $n \le E R_{t_n^u} - \epsilon_{t_n^u} < R_{t_n^u}$, where the first inequality is due to the
definition of $t_n^u$.

• $P(T_n > t_n^u \mid \bar{H}_{t_n^u}) \le 1$

• $P(\bar{H}_{t_n^u}) \le \frac{1}{t_n^u}$ due to equation (9).

Therefore:

$$P(T_n > t_n^u) \le \frac{1}{t_n^u}. \tag{10}$$

Similarly:

$$P(T_n < t_n^l) = P(T_n < t_n^l \mid H_{t_n^l}) \cdot P(H_{t_n^l}) + P(T_n < t_n^l \mid \bar{H}_{t_n^l}) \cdot P(\bar{H}_{t_n^l}),$$

where:

• $P(T_n < t_n^l \mid H_{t_n^l}) = 0$ since at time $t = t_n^l$ the destination has received fewer than $n$ innovative
packets. Indeed, given that $H_{t_n^l}$ holds: $R_{t_n^l} < E R_{t_n^l} + \epsilon_{t_n^l} \le n$, where the last inequality is due to the
definition of $t_n^l$.

• $P(H_{t_n^l}) \le 1$

• $P(T_n < t_n^l \mid \bar{H}_{t_n^l}) \le 1$

• $P(\bar{H}_{t_n^l}) \le \frac{1}{t_n^l}$ due to equation (9).

Therefore:

$$P(T_n < t_n^l) \le \frac{1}{t_n^l}. \tag{11}$$

Equations (10) and (11) show that the random variable $T_n$ representing the time required for $n$ packets
to travel across a line network exhibits some kind of concentration between $t_n^l$ and $t_n^u$, which are both
functions of $n$. In the case of a line network, $E R_t = A \cdot t - r(t)$, where $A = (1 - \max_{1 \le i \le \ell} p_i)$ is a constant
equal to the capacity of the line network and $r(t)$ is a bounded function representing the expected number
of innovative packets that have crossed the first link (once again the worst link in the network has been
positioned as the first link) by time $t$ without having reached the destination. Since $r(t)$ is bounded, a
legitimate choice for large enough $n$ for $t_n^u$ and $t_n^l$ is the following (see Lemma 1 in the Appendix):

$$t_n^u = (n + n^{1/2+\delta'})/A, \quad \delta' \in (0, 1/2) \tag{12}$$

$$t_n^l = (n - n^{1/2+\delta'})/A, \quad \delta' \in (0, 1/2) \tag{13}$$

From both (10) and (11):

$$P(t_n^l \le T_n \le t_n^u) = 1 - P(T_n < t_n^l) - P(T_n > t_n^u) \ge 1 - \frac{1}{t_n^l} - \frac{1}{t_n^u} \tag{14}$$

and by substituting in (14) the $t_n^u$, $t_n^l$ from equations (12) and (13) we get:

$$P\left(-\frac{n^{1/2+\delta'}}{A} \le T_n - \frac{n}{A} \le \frac{n^{1/2+\delta'}}{A}\right) \ge 1 - \frac{A}{n - n^{1/2+\delta'}} - \frac{A}{n + n^{1/2+\delta'}}$$

and since $E T_n = \frac{n}{A} + O(1)$ we have:

$$P\left(|T_n - E T_n| \le \frac{n^{1/2+\delta}}{A}\right) \ge 1 - \frac{2A}{n} - \frac{2A\, n^{2\delta'}}{n^2 - n^{1+2\delta'}}$$

or

$$P\left(|T_n - E T_n| > \frac{n^{1/2+\delta}}{A}\right) \le \frac{2A}{n} + \frac{2A\, n^{2\delta'}}{n^2 - n^{1+2\delta'}},$$

where $\delta > \delta'$. Since the right-hand side is increasing in $\delta'$, it is at most the bound of Theorem 3, and a simple
substitution of $A$ with $(1 - \max_j p_j)$ concludes the proof.
Fig. 2. The probability mass function of $T_n$ for a two-hop network with $n = 50$, $p_1 = 0.5$, $p_2 = 0.3$

IV. Discussion and Conclusions 

In this paper we analyzed the delay function and characterized its asymptotic behavior for an arbitrary
set of erasure probabilities $p_1, p_2, \ldots, p_\ell$ that has a single worst link. The validity of our analysis is
shown experimentally in Fig. 2 and Fig. 3. In particular, Fig. 2 shows the probability mass function (pmf)
of $T_n$, computed via simulation, tightly concentrated around its expected value for a somewhat small
value of $n = 50$. Fig. 3 shows the delay function $D(n, p_1, p_2)$ rapidly approaching the computed bound
$\bar{D}(p_1, p_2)$ as $n$ grows (for $p_1 = 0.5$, $p_2 = 0.3$).

One limitation of our technique is the assumption of a single worst link. It is critical in our analysis
because, after bringing the worst link to the first position, it guarantees that all the other
queues are bounded in expectation. If there is more than one bottleneck link the delay function can be
unbounded [3] and the general behavior remains a topic for future work. Further understanding the delay
function for more general networks is a challenging problem that might be relevant for delay-critical
applications.

Fig. 3. The delay function $D(n, p_1, p_2)$ for a two-hop network with $p_1 = 0.5$, $p_2 = 0.3$

Acknowledgment 

This material is partly funded by subcontract #069153 issued by BAE Systems National Security 
Solutions, Inc. and supported by the Defense Advanced Research Projects Agency (DARPA) and the 
Space and Naval Warfare System Center (SPAWARSYSCEN), San Diego under Contract No. N66001-08-C-2013,
and by Caltech's Lee Center for Advanced Networking.

Appendix 

Definition 1: A binary relation $\preceq$ defined on a set $P$ is called a preorder if it is reflexive and transitive,
i.e. $\forall a, b, c \in P$:

$$a \preceq a \quad \text{(reflexivity)} \tag{15}$$

$$(a \preceq b) \wedge (b \preceq c) \Rightarrow a \preceq c \quad \text{(transitivity)} \tag{16}$$

Definition 2: On the set $\mathbb{N}^{\ell-1}$ of all integer $(\ell - 1)$-tuples we define the regular preorder $\preceq$, that is,
$\forall a, b \in \mathbb{N}^{\ell-1}$: $a \preceq b$ iff $a_1 \le b_1, \ldots, a_{\ell-1} \le b_{\ell-1}$, where $a = (a_1, \ldots, a_{\ell-1})$ and $b = (b_1, \ldots, b_{\ell-1})$. Similarly
we can define the preorder $\succeq$.

Definition 3: A random vector $X \in \mathbb{N}^{\ell-1}$ is said to be stochastically smaller in the usual stochastic
order than a random vector $Y \in \mathbb{N}^{\ell-1}$ (denoted by $X \preceq_{st} Y$) if: $\forall \omega \in \mathbb{N}^{\ell-1}$, $P(X \succeq \omega) \le P(Y \succeq \omega)$.

Definition 4: A family of random variables $\{Y_n\}_{n \in \mathbb{N}}$ is called stochastically increasing ($\preceq_{st}$-increasing)
if $Y_k \preceq_{st} Y_n$ whenever $k \le n$.

Proof: [Proof of Proposition 1] The Markov process $\{Y_n, n \ge 1\}$ is a multidimensional process on
$E \subset \mathbb{N}^{\ell-1}$ representing the number of innovative packets at nodes $N^{(2)}, \ldots, N^{(\ell)}$ when packet $n$ arrives
at $N^{(2)}$. To prove that the Markov process $\{Y_n, n \ge 1\}$ is stochastically increasing we introduce two
other processes $\{X_n, n \ge 1\}$ and $\{Z_n, n \ge 1\}$ having the same state space and transition probabilities as
$\{Y_n, n \ge 1\}$.

More precisely, the Markov process $\{Y_n, n \ge 1\}$ is effectively observing the evolution of the number
of innovative packets present at every node of the tandem queue. We define the two new processes
$\{X_n, n \ge 1\}$ and $\{Z_n, n \ge 1\}$ to observe the evolution of two other tandem queues having the same link
failure probabilities as the queue of $\{Y_n, n \ge 1\}$.

Fig. 4. Multi-hop network with the corresponding Markov chains (three tandem queues, observed by $\{Y_n, n \ge 1\}$, $\{X_n, n \ge 1\}$ and $\{Z_n, n \ge 1\}$ respectively)

As seen in Fig. 4, at each time step and at every link, the queues for $\{X_n, n \ge 1\}$ and $\{Z_n, n \ge 1\}$
either both succeed or both fail. Moreover, the successes or failures on each link of the queues
observed by $\{X_n, n \ge 1\}$ and $\{Z_n, n \ge 1\}$ are independent of the successes or failures on the queue
observed by $\{Y_n, n \ge 1\}$. Formally, the joint process $\{(X_n, Z_n), n \ge 1\}$ constitutes a coupling, meaning
that marginally each one of $\{X_n, n \ge 1\}$ and $\{Z_n, n \ge 1\}$ has the transition matrix $P_Y$ of $\{Y_n, n \ge 1\}$.
If the Markov processes $\{X_n, n \ge 1\}$ and $\{Z_n, n \ge 1\}$ have different initial conditions then the following
relation holds:

$$X_1 \preceq Z_1 \Rightarrow X_n \preceq Z_n. \tag{17}$$

The proof of the above statement is very similar to the proof of Proposition 2 in [10]. Essentially
relation (17) states that since at both queues all links succeed or fail together, the queue that holds more
packets at each node initially ($n = 1$) will also hold more packets subsequently ($n > 1$) at every node.
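The order-preservation idea behind this coupling can be illustrated with a small simulation. The sketch below is a simplified, hypothetical version: it advances both queues one time step at a time (the processes in the proof are indexed by packet arrivals instead) and feeds them the same arrival and link outcomes, then checks that the componentwise order between the two states is never violated along the sample path.

```python
import random


def coupled_step(x, z, link_on, arrival):
    """Advance two tandem queues one time step using the SAME link outcomes.

    x, z: lists of queue lengths at the intermediate nodes (hypothetical
    representation of the states of the two coupled processes).
    link_on[j]: True if the link leaving intermediate node j is ON.
    arrival: True if the first link delivers a new innovative packet.
    """
    def step(q):
        new = list(q)
        # Decide departures from the queue lengths at the start of the step.
        moves = [link_on[j] and q[j] > 0 for j in range(len(q))]
        for j, moved in enumerate(moves):
            if moved:
                new[j] -= 1
                if j + 1 < len(q):
                    new[j + 1] += 1
        if arrival:
            new[0] += 1
        return new
    return step(x), step(z)


def order_preserved(x0, z0, erasure_probs, steps, seed=0):
    """Check x <= z componentwise at every step of one coupled sample path."""
    rng = random.Random(seed)
    x, z = list(x0), list(z0)
    for _ in range(steps):
        arrival = rng.random() >= erasure_probs[0]
        link_on = [rng.random() >= p for p in erasure_probs[1:]]
        x, z = coupled_step(x, z, link_on, arrival)
        if any(a > b for a, b in zip(x, z)):
            return False
    return True
```

Here `erasure_probs` lists the erasure probability of the first link followed by those of the links leaving the intermediate nodes, so it has one more entry than the state vectors.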

The initial state $Y_1$ of the Markov process $\{Y_n, n \ge 1\}$ is the state $\alpha = (1, 0, \ldots, 0)$, which is also called the
minimal state since any other state is greater than the minimal state. To prove Proposition 1 we set both
processes $\{Y_n, n \ge 1\}$ and $\{X_n, n \ge 1\}$ to start from the minimal state ($Y_1 \stackrel{d}{=} \delta_\alpha$, $X_1 \stackrel{d}{=} \delta_\alpha$, where $\stackrel{d}{=}$ means
equality in distribution), whereas process $\{Z_n, n \ge 1\}$ has initial distribution $\mu$ that is the distribution of
process $\{Y_n, n \ge 1\}$ after $(n - k)$ steps ($\mu = P_Y^{n-k} \delta_\alpha$ and $Z_1 \stackrel{d}{=} \mu$). Then for every $\omega$ in the state space of
$\{Y_n, n \ge 1\}$ we get:

$$P(X_n \succeq \omega) = P(Y_n \succeq \omega) = P(Z_k \succeq \omega) \tag{18}$$

where the first equality holds since the two processes have the same distribution (both start from the
minimal element and have the same transition matrices) and the second equality holds since $Z_k$ has
distribution $P_Y^{k-1} \mu = P_Y^{n-1} \delta_\alpha$, which is the distribution of $Y_n$.

Moreover, due to the definition of the minimal element, $X_1 \preceq Z_1$, and using (17) we get $X_n \preceq Z_n$ for all $n$.
Therefore

$$P(Z_k \succeq \omega) \ge P(X_k \succeq \omega) = P(Y_k \succeq \omega). \tag{19}$$

The last equality follows from the fact that the two processes have the same law. Equations (18) and
(19) conclude the proof. ■

Definition 5: A sequence of random variables $V_0, V_1, \ldots$ is said to be a martingale with respect to
another sequence $U_0, U_1, \ldots$ if, for all $n \ge 0$, the following conditions hold:

• $E[|V_n|] < \infty$

• $E[V_{n+1} \mid U_0, \ldots, U_n] = V_n$

A sequence of random variables $V_0, V_1, \ldots$ is called a martingale when it is a martingale with respect to
itself. That is:

• $E[|V_n|] < \infty$

• $E[V_{n+1} \mid V_0, \ldots, V_n] = V_n$

Theorem 4 (Azuma-Hoeffding Inequality): Let $X_0, X_1, \ldots, X_n$ be a martingale such that

$$B_k \le X_k - X_{k-1} \le B_k + d_k$$

for some constants $d_k$ and for some random variables $B_k$ that may be a function of $X_0, \ldots, X_{k-1}$. Then
for all $t \ge 0$ and any $\lambda > 0$,

$$P(|X_t - X_0| \ge \lambda) \le 2\exp\left(-\frac{2\lambda^2}{\sum_{k=1}^{t} d_k^2}\right).$$

Proof: Theorem 12.6 in [14].

Proof: [Proof of Proposition 2] The proof is based on the fact that from a sequence of random
variables $U_1, U_2, \ldots, U_n$ and any function $f$ it is possible to define a new sequence $V_0, \ldots, V_n$

$$V_0 = E[f(U_1, \ldots, U_n)]$$

$$V_i = E[f(U_1, \ldots, U_n) \mid U_1, \ldots, U_i]$$

that is a martingale (Doob martingale). Using the identity $E[V \mid W] = E[E[V \mid U, W] \mid W]$ it is easy to verify
that the above sequence $V_0, \ldots, V_n$ is indeed a martingale. Moreover, if the function $f$ is $c$-Lipschitz and
$U_1, \ldots, U_n$ are independent, it can be proved that the differences $V_i - V_{i-1}$ are restricted within bounded
intervals [14] (pages 305-306).

The function $R_t = g(z_{11}, \ldots, z_{t\ell})$ has a bounded expectation, is 1-Lipschitz and the random variables $z_{ij}$ are
independent, and therefore all the requirements of the above analysis hold. Specifically, by setting

$$G_h = E[\,g(z_{11}, \ldots, z_{t\ell}) \mid \underbrace{z_{11}, \ldots, z_{ij}}_{h \text{ terms in total}}\,]$$

we can apply the Azuma-Hoeffding inequality on the $G_0, \ldots, G_{t\ell}$ martingale and we get the following
concentration result

$$P[|G_{t\ell} - G_0| \ge \lambda] = P[|R_t - E[R_t]| \ge \lambda] \le 2\exp\left\{-\frac{2\lambda^2}{t\ell}\right\}. \tag{20}$$

The equality above holds since

• $G_0 = E[R_t]$

• $G_{t\ell} = R_t$ (the random variable itself)

and by substituting in (20) $\lambda$ with $\epsilon_t = \sqrt{\frac{t\ell}{2}\ln(2t)}$ we get

$$P[|R_t - E[R_t]| \ge \epsilon_t] \le \frac{1}{t}.$$



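Lemma 1: When the expected number of innovative packets $E R_t$ received at the destination by time $t$
is given by $E R_t = A \cdot t - r(t)$, where $A$ is a constant and $r(t)$ is a bounded function, then one legitimate
choice for $t_n^u$ and $t_n^l$ is:

$$t_n^u = (n + n^{1/2+\delta'})/A, \quad \delta' \in (0, 1/2)$$

$$t_n^l = (n - n^{1/2+\delta'})/A, \quad \delta' \in (0, 1/2)$$

Proof: The only requirement for $t_n^u$ is that it is a $t$ such that $E R_t - \epsilon_t \ge n$. This is indeed true for
large enough $n$ if we substitute $t_n^u$ with $(n + n^{1/2+\delta'})/A$:

$$E[R_{t_n^u}] - \epsilon_{t_n^u} \ge n \Leftrightarrow A t_n^u - r(t_n^u) - \epsilon_{t_n^u} \ge n \Leftrightarrow A t_n^u - r(t_n^u) - \sqrt{\frac{t_n^u \ell}{2}\ln(2 t_n^u)} \ge n$$

$$\Leftrightarrow n^{1/2+\delta'} \ge \sqrt{\frac{\ell (n + n^{1/2+\delta'})}{2A}\ln\frac{2(n + n^{1/2+\delta'})}{A}} + r(t_n^u)$$

$$\Leftarrow n^{1/2+\delta'} \ge \sqrt{\frac{\ell (n + n^{1/2+\delta'})}{2A}\ln\frac{2(n + n^{1/2+\delta'})}{A}} + B$$

$$\Leftrightarrow n^{\delta'} \ge \sqrt{\frac{\ell (1 + n^{\delta'-1/2})}{2A}\ln\frac{2(n + n^{1/2+\delta'})}{A}} + \frac{B}{n^{1/2}},$$

where $B$ is the upper bound of the function $r(t)$ and the last inequality holds for large enough $n$.

Similarly, $t_n^l$ is a $t$ such that $E R_t + \epsilon_t \le n$. This is indeed true for large enough $n$ if we substitute
$t_n^l$ with $(n - n^{1/2+\delta'})/A$:

$$E[R_{t_n^l}] + \epsilon_{t_n^l} \le n \Leftrightarrow A t_n^l - r(t_n^l) + \sqrt{\frac{t_n^l \ell}{2}\ln(2 t_n^l)} \le n$$

$$\Leftarrow n - n^{1/2+\delta'} + \sqrt{\frac{\ell (n - n^{1/2+\delta'})}{2A}\ln\frac{2(n - n^{1/2+\delta'})}{A}} \le n$$

$$\Leftrightarrow \sqrt{\frac{\ell (1 - n^{\delta'-1/2})}{2A}\ln\frac{2(n - n^{1/2+\delta'})}{A}} \le n^{\delta'},$$

where the second step uses $r(t) \ge 0$ and the last inequality holds for large enough $n$.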